Prologue: A machine learning sampler
Abstract
You may not be aware of it, but chances are that you are already a regular user of machine learning technology. Most current e-mail clients incorporate algorithms to identify and filter out spam e-mail, also known as junk e-mail or unsolicited bulk e-mail. Early spam filters relied on hand-coded pattern matching techniques such as regular expressions, but it soon became apparent that this approach is hard to maintain and offers insufficient flexibility – after all, one person’s spam is another person’s ham! Additional adaptivity and flexibility are achieved by employing machine learning techniques. SpamAssassin is a widely used open-source spam filter. It calculates a score for an incoming e-mail, based on a number of built-in rules or ‘tests’ in SpamAssassin’s terminology, and adds a ‘junk’ flag and a summary report to the e-mail’s headers if the score is 5 or more. Here is an example report for an e-mail I received:
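Separately from that report, the scoring scheme described above (each test that fires contributes a weight, and the e-mail is flagged as junk once the total reaches 5) can be illustrated with a minimal Python sketch. The rule names, patterns, and weights below are hypothetical and chosen only for illustration; they are not SpamAssassin's actual tests or scores, and the real filter is considerably more elaborate.

```python
import re
from typing import List, NamedTuple, Pattern, Tuple


class Rule(NamedTuple):
    name: str          # hypothetical test name, not a real SpamAssassin rule
    weight: float      # hypothetical score added when the test fires
    pattern: Pattern[str]


# Illustrative rules only: names, patterns, and weights are assumptions.
RULES: List[Rule] = [
    Rule("MENTIONS_FREE_MONEY", 2.5, re.compile(r"free money", re.IGNORECASE)),
    Rule("MANY_EXCLAMATIONS", 1.5, re.compile(r"!{3,}")),
    Rule("URGENT_WORDING", 2.0, re.compile(r"\b(act now|urgent)\b", re.IGNORECASE)),
]

# The source text states that a message is flagged when the score is 5 or more.
THRESHOLD = 5.0


def score_email(text: str) -> Tuple[float, List[str]]:
    """Return the total score and the names of the tests that fired."""
    fired = [rule for rule in RULES if rule.pattern.search(text)]
    return sum(rule.weight for rule in fired), [rule.name for rule in fired]


def is_junk(text: str) -> bool:
    """Flag the message as junk when its score reaches the threshold."""
    score, _ = score_email(text)
    return score >= THRESHOLD


if __name__ == "__main__":
    for message in (
        "URGENT!!! Claim your free money today",
        "Meeting moved to 3pm, see you there.",
    ):
        score, fired = score_email(message)
        print(f"{score:.1f} junk={is_junk(message)} tests={fired}")
```

With fixed, hand-assigned weights this is still a hand-coded filter; the machine learning angle comes in when the weights are learned from e-mails already labelled as spam or ham rather than set by hand.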
Similar resources
Introduction to Machine Learning & Case-Based Reasoning
Prologue. These notes refer to the course of Machine Learning (course 395), Computing Department, Imperial College London. The goal of this syllabus is to summarize the basics of machine learning and to provide a detailed explanation of case-based reasoning. This chapter introduces the term "machine learning" and defines what we mean when using this term. This is a very short summary of th...
Effective Function Recovery for COTS Binaries using Interface Verification
Function recovery is a critical step in many binary analysis and instrumentation tasks. Existing approaches rely on commonly used function prologue patterns to recognize function starts, and possibly epilogues for the ends. However, this approach is not robust when dealing with different compilers, compiler versions, and compilation switches. Although machine learning techniques have been propo...
A Gibbs Sampler for Learning DAGs
We propose a Gibbs sampler for structure learning in directed acyclic graph (DAG) models. The standard Markov chain Monte Carlo algorithms used for learning DAGs are random-walk Metropolis-Hastings samplers. These samplers are guaranteed to converge asymptotically but often mix slowly when exploring the large graph spaces that arise in structure learning. In each step, the sampler we propose dr...
Learning When to Reject an Importance Sample
When observations are incomplete or data are missing, approximate inference methods based on importance sampling are often used. Unfortunately, when the target and proposal distributions are dissimilar, the sampling procedure leads to biased estimates or requires a prohibitive number of samples. Our method approximates a multivariate target distribution by sampling from an existing, sequential ...
Publication date: 2015